Overview

Dataset statistics

Number of variables: 19
Number of observations: 19459
Missing cells: 11482
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 MiB
Average record size in memory: 482.0 B
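
The missing-cell percentage follows from the counts above: missing cells divided by the total number of cells (variables × observations). A minimal sketch of that arithmetic, using only the figures reported here:

```python
# Figures copied from the Dataset statistics table above.
n_variables = 19
n_observations = 19459
missing_cells = 11482

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are empty

print(f"Missing cells: {missing_pct:.1f}%")
```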

Variable types

Categorical: 8
Text: 3
Numeric: 8

Alerts

followers is highly overall correlated with following and 6 other fields [High correlation]
following is highly overall correlated with followers and 4 other fields [High correlation]
label is highly overall correlated with text_bot_count [High correlation]
log_followers is highly overall correlated with followers and 6 other fields [High correlation]
log_following is highly overall correlated with followers and 4 other fields [High correlation]
log_public_gists is highly overall correlated with followers and 4 other fields [High correlation]
log_public_repos is highly overall correlated with followers and 6 other fields [High correlation]
public_gists is highly overall correlated with followers and 4 other fields [High correlation]
public_repos is highly overall correlated with followers and 6 other fields [High correlation]
text_bot_count is highly overall correlated with label and 1 other field [High correlation]
type is highly overall correlated with text_bot_count [High correlation]
label is highly imbalanced (67.2%) [Imbalance]
type is highly imbalanced (92.8%) [Imbalance]
site_admin is highly imbalanced (95.7%) [Imbalance]
text_bot_count is highly imbalanced (93.5%) [Imbalance]
bio has 10758 (55.3%) missing values [Missing]
public_repos is highly skewed (γ1 = 54.52274234) [Skewed]
public_gists is highly skewed (γ1 = 69.95022964) [Skewed]
followers is highly skewed (γ1 = 32.0725488) [Skewed]
following is highly skewed (γ1 = 30.4418986) [Skewed]
public_repos has 966 (5.0%) zeros [Zeros]
public_gists has 7790 (40.0%) zeros [Zeros]
followers has 1436 (7.4%) zeros [Zeros]
following has 5901 (30.3%) zeros [Zeros]
log_public_repos has 966 (5.0%) zeros [Zeros]
log_public_gists has 7790 (40.0%) zeros [Zeros]
log_followers has 1436 (7.4%) zeros [Zeros]
log_following has 5901 (30.3%) zeros [Zeros]
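
The log_* columns report exactly the same zero counts as their raw counterparts, which is what a log(1 + x) transform would produce, since it maps 0 to 0. The report does not name the transform, so the sketch below is an assumption; it is at least consistent with the reported maximum of log_public_repos (10.819798 for a raw maximum of 50000). The function name log_transform is hypothetical.

```python
import math

def log_transform(count: int) -> float:
    """Assumed transform behind the log_* columns: natural log of (1 + count)."""
    return math.log1p(count)

# Zero stays zero, so the Zeros alerts carry over unchanged.
zero_value = log_transform(0)

# The raw maximum of public_repos (50000) under the assumed transform.
max_value = log_transform(50000)

print(zero_value, round(max_value, 6))
```

A plain log(x) would be undefined at zero, which is one reason to prefer log(1 + x) for count data with many zeros.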

Reproduction

Analysis started: 2025-01-04 14:34:33.728733
Analysis finished: 2025-01-04 14:34:42.078433
Duration: 8.35 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
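
The reported duration can be checked against the two timestamps. A small sketch using only the standard library and the values listed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-01-04 14:34:33.728733")
finished = datetime.fromisoformat("2025-01-04 14:34:42.078433")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```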

Variables

label
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

H: 18287
Bot: 1172

Length

Max length: 5
Median length: 5
Mean length: 4.8795416
Min length: 3

Characters and Unicode

Total characters: 94951
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
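
As an illustration of those character properties, Python's standard unicodedata module exposes the general category of each code point. The sketch below weights each character of the two label values by the value counts reported for this variable; it is not the profiler's own code, only a way to reproduce the totals (94951 characters, of which 75492 are lowercase and 19459 uppercase letters).

```python
import unicodedata
from collections import Counter

# Value counts for the label column, copied from the report.
value_counts = {"Human": 18287, "Bot": 1172}

category_totals = Counter()
for value, count in value_counts.items():
    for char in value:
        # unicodedata.category returns the two-letter general category,
        # e.g. "Lu" (Uppercase Letter) or "Ll" (Lowercase Letter).
        category_totals[unicodedata.category(char)] += count

total_characters = sum(category_totals.values())
print(dict(category_totals), total_characters)
```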

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Human
2nd row: Human
3rd row: Human
4th row: Bot
5th row: Human

Common Values

Value | Count | Frequency (%)
Human | 18287 | 94.0%
Bot | 1172 | 6.0%
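
The Imbalance alert for label reports 67.2%. That figure is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition (the report itself does not spell out the formula), and the helper name imbalance_score is hypothetical.

```python
import math

def imbalance_score(counts: list[int]) -> float:
    """1 - H / log2(k): 0 for balanced classes, approaching 1 as one class dominates."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return 1 - entropy / math.log2(n_classes)

label_score = imbalance_score([18287, 1172])   # Human vs. Bot counts from the table
type_score = imbalance_score([19289, 170])     # counts reported for the type variable

print(f"label: {label_score:.1%}, type: {type_score:.1%}")
```

With a 94/6 split, any model evaluated on this label should be judged on per-class metrics rather than raw accuracy.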

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
Human | 18287 | 94.0%
Bot | 1172 | 6.0%

Most occurring characters

Value | Count | Frequency (%)
H | 18287 | 19.3%
u | 18287 | 19.3%
m | 18287 | 19.3%
a | 18287 | 19.3%
n | 18287 | 19.3%
B | 1172 | 1.2%
o | 1172 | 1.2%
t | 1172 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 75492 | 79.5%
Uppercase Letter | 19459 | 20.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
u | 18287 | 24.2%
m | 18287 | 24.2%
a | 18287 | 24.2%
n | 18287 | 24.2%
o | 1172 | 1.6%
t | 1172 | 1.6%

Uppercase Letter
Value | Count | Frequency (%)
H | 18287 | 94.0%
B | 1172 | 6.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 94951 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
H | 18287 | 19.3%
u | 18287 | 19.3%
m | 18287 | 19.3%
a | 18287 | 19.3%
n | 18287 | 19.3%
B | 1172 | 1.2%
o | 1172 | 1.2%
t | 1172 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 94951 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
H | 18287 | 19.3%
u | 18287 | 19.3%
m | 18287 | 19.3%
a | 18287 | 19.3%
n | 18287 | 19.3%
B | 1172 | 1.2%
o | 1172 | 1.2%
t | 1172 | 1.2%

type
Categorical

High correlation  Imbalance 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
1
19289 
0
 
170

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

Most occurring characters

ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 19289
99.1%
0 170
 
0.9%

site_admin
Categorical

Imbalance 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
0
19369 
1
 
90

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 19369
99.5%
1 90
 
0.5%

company
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
1
10635 
0
8824 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

Most occurring characters

ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 10635
54.7%
0 8824
45.3%

blog
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
0
11084 
1
8375 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

Most occurring characters

ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 11084
57.0%
1 8375
43.0%

location
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
1
12502 
0
6957 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

Most occurring characters

ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 12502
64.2%
0 6957
35.8%

hireable
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.6 MiB
0
16228 
1
3231 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters19459
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

Most occurring characters

ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

Most occurring scripts

ValueCountFrequency (%)
Common 19459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 16228
83.4%
1 3231
 
16.6%

bio
Text

Missing 

Distinct8511
Distinct (%)97.8%
Missing10758
Missing (%)55.3%
Memory size2.2 MiB

Length

Max length3013
Median length311
Mean length66.395817
Min length1

Characters and Unicode

Total characters577710
Distinct characters195
Distinct categories17 ?
Distinct scripts4 ?
Distinct blocks7 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8446 ?
Unique (%)97.1%

Sample

1st rowI just press the buttons randomly, and the program evolves...
2nd rowTime is unimportant, only life important.
3rd rowDone studying. Need challenges.
4th rowAdministrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004.
5th rowSenior Software Engineer at Google, working on Certificate Transparency and generalized transparency.
ValueCountFrequency (%)
3017
 
3.9%
and 2499
 
3.2%
engineer 1568
 
2.0%
software 1493
 
1.9%
of 1469
 
1.9%
at 1372
 
1.8%
developer 1210
 
1.6%
the 1073
 
1.4%
a 1031
 
1.3%
i 1021
 
1.3%
Other values (15629) 62170
79.8%

Most occurring characters

ValueCountFrequency (%)
69483
 
12.0%
e 50777
 
8.8%
a 32167
 
5.6%
o 32151
 
5.6%
r 31682
 
5.5%
n 31423
 
5.4%
t 30985
 
5.4%
i 28328
 
4.9%
s 20313
 
3.5%
l 15440
 
2.7%
Other values (185) 234961
40.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 398013
68.9%
Space Separator 69488
 
12.0%
Uppercase Letter 49508
 
8.6%
Other Punctuation 29722
 
5.1%
Decimal Number 14833
 
2.6%
Dash Punctuation 3735
 
0.6%
Math Symbol 3721
 
0.6%
Control 3047
 
0.5%
Open Punctuation 2229
 
0.4%
Private Use 1459
 
0.3%
Other values (7) 1955
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 50777
12.8%
a 32167
 
8.1%
o 32151
 
8.1%
r 31682
 
8.0%
n 31423
 
7.9%
t 30985
 
7.8%
i 28328
 
7.1%
s 20313
 
5.1%
l 15440
 
3.9%
c 14098
 
3.5%
Other values (49) 110649
27.8%
Uppercase Letter
ValueCountFrequency (%)
S 5632
 
11.4%
C 3765
 
7.6%
E 2996
 
6.1%
I 2911
 
5.9%
T 2822
 
5.7%
P 2821
 
5.7%
D 2722
 
5.5%
A 2711
 
5.5%
F 2513
 
5.1%
M 2315
 
4.7%
Other values (42) 18300
37.0%
Other Punctuation
ValueCountFrequency (%)
, 9327
31.4%
. 7615
25.6%
@ 4145
13.9%
: 2362
 
7.9%
/ 1983
 
6.7%
? 853
 
2.9%
' 744
 
2.5%
& 655
 
2.2%
! 380
 
1.3%
# 305
 
1.0%
Other values (13) 1353
 
4.6%
Math Symbol
ValueCountFrequency (%)
| 1127
30.3%
+ 904
24.3%
384
 
10.3%
¬ 162
 
4.4%
155
 
4.2%
148
 
4.0%
119
 
3.2%
119
 
3.2%
118
 
3.2%
± 88
 
2.4%
Other values (9) 397
 
10.7%
Decimal Number
ValueCountFrequency (%)
0 4832
32.6%
2 2341
15.8%
1 2299
15.5%
3 1208
 
8.1%
5 825
 
5.6%
4 822
 
5.5%
9 737
 
5.0%
6 641
 
4.3%
8 591
 
4.0%
7 537
 
3.6%
Open Punctuation
ValueCountFrequency (%)
1226
55.0%
( 530
23.8%
397
 
17.8%
[ 57
 
2.6%
{ 19
 
0.9%
Other Symbol
ValueCountFrequency (%)
® 192
39.9%
° 100
20.8%
© 97
20.2%
78
16.2%
14
 
2.9%
Currency Symbol
ValueCountFrequency (%)
¥ 74
40.2%
£ 57
31.0%
¢ 40
21.7%
$ 13
 
7.1%
Modifier Symbol
ValueCountFrequency (%)
¨ 62
47.0%
´ 53
40.2%
` 14
 
10.6%
^ 3
 
2.3%
Dash Punctuation
ValueCountFrequency (%)
- 3491
93.5%
157
 
4.2%
87
 
2.3%
Close Punctuation
ValueCountFrequency (%)
) 549
87.8%
] 58
 
9.3%
} 18
 
2.9%
Space Separator
ValueCountFrequency (%)
69483
> 99.9%
  5
 
< 0.1%
Other Letter
ValueCountFrequency (%)
º 206
53.9%
ª 176
46.1%
Control
ValueCountFrequency (%)
3047
100.0%
Private Use
ValueCountFrequency (%)
1459
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 145
100.0%
Final Punctuation
ValueCountFrequency (%)
» 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 447652
77.5%
Common 128408
 
22.2%
Unknown 1459
 
0.3%
Greek 191
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 50777
 
11.3%
a 32167
 
7.2%
o 32151
 
7.2%
r 31682
 
7.1%
n 31423
 
7.0%
t 30985
 
6.9%
i 28328
 
6.3%
s 20313
 
4.5%
l 15440
 
3.4%
c 14098
 
3.1%
Other values (100) 160288
35.8%
Common
ValueCountFrequency (%)
69483
54.1%
, 9327
 
7.3%
. 7615
 
5.9%
0 4832
 
3.8%
@ 4145
 
3.2%
- 3491
 
2.7%
3047
 
2.4%
: 2362
 
1.8%
2 2341
 
1.8%
1 2299
 
1.8%
Other values (72) 19466
 
15.2%
Greek
ValueCountFrequency (%)
Ω 112
58.6%
π 79
41.4%
Unknown
ValueCountFrequency (%)
1459
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 560027
96.9%
None 12464
 
2.2%
Punctuation 2400
 
0.4%
PUA 1459
 
0.3%
Math Operators 1268
 
0.2%
Letterlike Symbols 78
 
< 0.1%
Geometric Shapes 14
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
69483
 
12.4%
e 50777
 
9.1%
a 32167
 
5.7%
o 32151
 
5.7%
r 31682
 
5.7%
n 31423
 
5.6%
t 30985
 
5.5%
i 28328
 
5.1%
s 20313
 
3.6%
l 15440
 
2.8%
Other values (86) 217278
38.8%
None
ValueCountFrequency (%)
ü 1509
 
12.1%
Ä 992
 
8.0%
 438
 
3.5%
ç 364
 
2.9%
ñ 364
 
2.9%
Ô 350
 
2.8%
è 337
 
2.7%
Å 323
 
2.6%
Ê 274
 
2.2%
Á 264
 
2.1%
Other values (66) 7249
58.2%
PUA
ValueCountFrequency (%)
1459
100.0%
Punctuation
ValueCountFrequency (%)
1226
51.1%
397
 
16.5%
239
 
10.0%
157
 
6.5%
147
 
6.1%
134
 
5.6%
87
 
3.6%
8
 
0.3%
5
 
0.2%
Math Operators
ValueCountFrequency (%)
384
30.3%
155
12.2%
148
 
11.7%
119
 
9.4%
119
 
9.4%
118
 
9.3%
85
 
6.7%
70
 
5.5%
57
 
4.5%
12
 
0.9%
Letterlike Symbols
ValueCountFrequency (%)
78
100.0%
Geometric Shapes
ValueCountFrequency (%)
14
100.0%

public_repos
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 667
Distinct (%): 3.4%
Missing: 46
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.189461
Minimum: 0
Maximum: 50000
Zeros: 966
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 820.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 11
Median: 34
Q3: 82
95-th percentile: 248
Maximum: 50000
Range: 50000
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 575.11725
Coefficient of variation (CV): 6.9133427
Kurtosis: 3754.597
Mean: 83.189461
Median Absolute Deviation (MAD): 28
Skewness: 54.522742
Sum: 1614957
Variance: 330759.85
Monotonicity: Not monotonic
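
Several of the public_repos statistics are derived from one another. The sketch below recomputes the mean, coefficient of variation, variance, and interquartile range from the other reported figures; it only checks the arithmetic and assumes the mean is taken over non-missing rows.

```python
# Figures copied from the public_repos section of the report.
n_observations = 19459
n_missing = 46
total = 1614957            # Sum
std_dev = 575.11725        # Standard deviation
q1, q3 = 11, 82            # Quartiles

n_valid = n_observations - n_missing   # rows that actually carry a value
mean = total / n_valid                 # Mean
cv = std_dev / mean                    # Coefficient of variation
variance = std_dev ** 2                # Variance
iqr = q3 - q1                          # Interquartile range

print(round(mean, 4), round(cv, 4), round(variance, 2), iqr)
```

The gap between the mean (about 83) and the median (34) is another view of the heavy right skew flagged in the alerts.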
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 966 | 5.0%
1 | 556 | 2.9%
2 | 468 | 2.4%
3 | 394 | 2.0%
4 | 377 | 1.9%
6 | 362 | 1.9%
5 | 350 | 1.8%
7 | 328 | 1.7%
9 | 306 | 1.6%
8 | 300 | 1.5%
Other values (657) | 15006 | 77.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 966 | 5.0%
1 | 556 | 2.9%
2 | 468 | 2.4%
3 | 394 | 2.0%
4 | 377 | 1.9%
5 | 350 | 1.8%
6 | 362 | 1.9%
7 | 328 | 1.7%
8 | 300 | 1.5%
9 | 306 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
50000 | 1 | < 0.1%
27746 | 1 | < 0.1%
26360 | 1 | < 0.1%
22618 | 1 | < 0.1%
20693 | 1 | < 0.1%
17425 | 1 | < 0.1%
16985 | 1 | < 0.1%
16839 | 1 | < 0.1%
9554 | 1 | < 0.1%
7068 | 1 | < 0.1%

public_gists
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct390
Distinct (%)2.0%
Missing38
Missing (%)0.2%
Infinite0
Infinite (%)0.0%
Mean28.5216
Minimum0
Maximum55781
Zeros7790
Zeros (%)40.0%
Negative0
Negative (%)0.0%
Memory size820.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median2
Q310
95-th percentile70
Maximum55781
Range55781
Interquartile range (IQR)10

Descriptive statistics

Standard deviation653.4023
Coefficient of variation (CV)22.909034
Kurtosis5437.9943
Mean28.5216
Median Absolute Deviation (MAD)2
Skewness69.95023
Sum553918
Variance426934.57
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 7790
40.0%
1 1818
 
9.3%
2 1118
 
5.7%
3 804
 
4.1%
4 652
 
3.4%
5 610
 
3.1%
6 473
 
2.4%
7 398
 
2.0%
9 323
 
1.7%
8 311
 
1.6%
Other values (380) 5124
26.3%
ValueCountFrequency (%)
0 7790
40.0%
1 1818
 
9.3%
2 1118
 
5.7%
3 804
 
4.1%
4 652
 
3.4%
5 610
 
3.1%
6 473
 
2.4%
7 398
 
2.0%
8 311
 
1.6%
9 323
 
1.7%
ValueCountFrequency (%)
55781 1
< 0.1%
53660 1
< 0.1%
28943 1
< 0.1%
26879 1
< 0.1%
15482 1
< 0.1%
12328 1
< 0.1%
10604 1
< 0.1%
8924 1
< 0.1%
4461 1
< 0.1%
4163 1
< 0.1%

followers
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct1564
Distinct (%)8.0%
Missing30
Missing (%)0.2%
Infinite0
Infinite (%)0.0%
Mean244.56544
Minimum0
Maximum95752
Zeros1436
Zeros (%)7.4%
Negative0
Negative (%)0.0%
Memory size820.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q17
median33
Q3124
95-th percentile822.6
Maximum95752
Range95752
Interquartile range (IQR)117

Descriptive statistics

Standard deviation1555.0957
Coefficient of variation (CV)6.3586076
Kurtosis1525.2067
Mean244.56544
Median Absolute Deviation (MAD)31
Skewness32.072549
Sum4751662
Variance2418322.6
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1436
 
7.4%
1 796
 
4.1%
2 616
 
3.2%
3 504
 
2.6%
4 443
 
2.3%
5 409
 
2.1%
6 396
 
2.0%
7 342
 
1.8%
8 337
 
1.7%
9 308
 
1.6%
Other values (1554) 13842
71.1%
ValueCountFrequency (%)
0 1436
7.4%
1 796
4.1%
2 616
3.2%
3 504
 
2.6%
4 443
 
2.3%
5 409
 
2.1%
6 396
 
2.0%
7 342
 
1.8%
8 337
 
1.7%
9 308
 
1.6%
ValueCountFrequency (%)
95752 1
< 0.1%
84979 1
< 0.1%
66203 1
< 0.1%
58452 1
< 0.1%
31120 1
< 0.1%
30287 1
< 0.1%
29719 1
< 0.1%
29414 1
< 0.1%
28411 1
< 0.1%
27775 1
< 0.1%

following
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct607
Distinct (%)3.1%
Missing151
Missing (%)0.8%
Infinite0
Infinite (%)0.0%
Mean42.653045
Minimum0
Maximum16741
Zeros5901
Zeros (%)30.3%
Negative0
Negative (%)0.0%
Memory size820.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median4
Q322
95-th percentile146
Maximum16741
Range16741
Interquartile range (IQR)22

Descriptive statistics

Standard deviation309.26905
Coefficient of variation (CV)7.2508081
Kurtosis1228.9191
Mean42.653045
Median Absolute Deviation (MAD)4
Skewness30.441899
Sum823545
Variance95647.344
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 5901
30.3%
1 1709
 
8.8%
2 1073
 
5.5%
3 774
 
4.0%
4 596
 
3.1%
5 518
 
2.7%
6 470
 
2.4%
7 399
 
2.1%
8 357
 
1.8%
9 317
 
1.6%
Other values (597) 7194
37.0%
ValueCountFrequency (%)
0 5901
30.3%
1 1709
 
8.8%
2 1073
 
5.5%
3 774
 
4.0%
4 596
 
3.1%
5 518
 
2.7%
6 470
 
2.4%
7 399
 
2.1%
8 357
 
1.8%
9 317
 
1.6%
ValueCountFrequency (%)
16741 1
< 0.1%
15931 1
< 0.1%
11921 1
< 0.1%
10268 1
< 0.1%
9720 1
< 0.1%
9686 1
< 0.1%
9532 1
< 0.1%
9367 1
< 0.1%
7374 1
< 0.1%
6207 1
< 0.1%
Distinct19434
Distinct (%)> 99.9%
Missing23
Missing (%)0.1%
Memory size2.0 MiB

Length

Max length27
Median length25
Mean length24.974532
Min length1

Characters and Unicode

Total characters485405
Distinct characters35
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique19432 ?
Unique (%)> 99.9%

Sample

1st row2011-09-26 17:27:03+00:00
2nd row2015-06-29 10:12:46+00:00
3rd row2008-08-29 16:20:03+00:00
4th row2014-05-20 18:43:09+00:00
5th row2012-08-16 14:19:13+00:00
ValueCountFrequency (%)
2012-07-05 18
 
< 0.1%
2014-10-08 15
 
< 0.1%
2013-06-10 15
 
< 0.1%
2017-06-09 14
 
< 0.1%
2013-01-25 14
 
< 0.1%
2014-05-16 14
 
< 0.1%
2013-01-15 13
 
< 0.1%
2014-04-15 13
 
< 0.1%
2012-06-18 13
 
< 0.1%
2013-05-14 13
 
< 0.1%
Other values (21976) 38709
99.6%

Most occurring characters

ValueCountFrequency (%)
0 144298
29.7%
: 58239
12.0%
1 57550
 
11.9%
2 50192
 
10.3%
- 38826
 
8.0%
3 19515
 
4.0%
19417
 
4.0%
+ 19413
 
4.0%
4 17545
 
3.6%
5 17230
 
3.5%
Other values (25) 43180
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 349479
72.0%
Other Punctuation 58239
 
12.0%
Dash Punctuation 38826
 
8.0%
Space Separator 19417
 
4.0%
Math Symbol 19413
 
4.0%
Lowercase Letter 27
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4
14.8%
e 3
11.1%
a 2
 
7.4%
i 2
 
7.4%
p 2
 
7.4%
u 2
 
7.4%
s 2
 
7.4%
k 1
 
3.7%
c 1
 
3.7%
t 1
 
3.7%
Other values (7) 7
25.9%
Decimal Number
ValueCountFrequency (%)
0 144298
41.3%
1 57550
 
16.5%
2 50192
 
14.4%
3 19515
 
5.6%
4 17545
 
5.0%
5 17230
 
4.9%
9 11170
 
3.2%
8 10933
 
3.1%
7 10541
 
3.0%
6 10505
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
25.0%
P 1
25.0%
C 1
25.0%
E 1
25.0%
Other Punctuation
ValueCountFrequency (%)
: 58239
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 38826
100.0%
Space Separator
ValueCountFrequency (%)
19417
100.0%
Math Symbol
ValueCountFrequency (%)
+ 19413
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 485374
> 99.9%
Latin 31
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4
 
12.9%
e 3
 
9.7%
a 2
 
6.5%
i 2
 
6.5%
p 2
 
6.5%
u 2
 
6.5%
s 2
 
6.5%
k 1
 
3.2%
S 1
 
3.2%
c 1
 
3.2%
Other values (11) 11
35.5%
Common
ValueCountFrequency (%)
0 144298
29.7%
: 58239
12.0%
1 57550
 
11.9%
2 50192
 
10.3%
- 38826
 
8.0%
3 19515
 
4.0%
19417
 
4.0%
+ 19413
 
4.0%
4 17545
 
3.6%
5 17230
 
3.5%
Other values (4) 43149
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 485405
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 144298
29.7%
: 58239
12.0%
1 57550
 
11.9%
2 50192
 
10.3%
- 38826
 
8.0%
3 19515
 
4.0%
19417
 
4.0%
+ 19413
 
4.0%
4 17545
 
3.6%
5 17230
 
3.5%
Other values (25) 43180
 
8.9%
Distinct19181
Distinct (%)98.7%
Missing23
Missing (%)0.1%
Memory size2.0 MiB

Length

Max length25
Median length25
Mean length24.829337
Min length1

Characters and Unicode

Total characters482583
Distinct characters34
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique19056 ?
Unique (%)98.0%

Sample

1st row2023-10-13 11:21:10+00:00
2nd row2023-10-07 06:26:14+00:00
3rd row2023-10-02 02:11:21+00:00
4th row2023-10-12 12:54:59+00:00
5th row2023-10-06 11:58:41+00:00
ValueCountFrequency (%)
2023-10-13 837
 
2.2%
2023-10-11 813
 
2.1%
2023-10-12 809
 
2.1%
2023-10-10 716
 
1.8%
2023-10-09 653
 
1.7%
2023-10-04 530
 
1.4%
2023-10-03 474
 
1.2%
2023-10-05 463
 
1.2%
2023-10-06 460
 
1.2%
2023-09-28 433
 
1.1%
Other values (17627) 32546
84.0%

Most occurring characters

ValueCountFrequency (%)
0 142136
29.5%
2 62535
13.0%
: 57888
12.0%
1 43291
 
9.0%
- 38592
 
8.0%
3 33261
 
6.9%
19300
 
4.0%
+ 19296
 
4.0%
4 13659
 
2.8%
5 13559
 
2.8%
Other values (24) 39066
 
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 347479
72.0%
Other Punctuation 57891
 
12.0%
Dash Punctuation 38592
 
8.0%
Space Separator 19300
 
4.0%
Math Symbol 19296
 
4.0%
Lowercase Letter 21
 
< 0.1%
Uppercase Letter 3
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
14.3%
o 2
9.5%
t 2
9.5%
l 2
9.5%
e 2
9.5%
v 2
9.5%
y 1
 
4.8%
g 1
 
4.8%
d 1
 
4.8%
i 1
 
4.8%
Other values (4) 4
19.0%
Decimal Number
ValueCountFrequency (%)
0 142136
40.9%
2 62535
18.0%
1 43291
 
12.5%
3 33261
 
9.6%
4 13659
 
3.9%
5 13559
 
3.9%
9 13040
 
3.8%
8 9649
 
2.8%
7 8491
 
2.4%
6 7858
 
2.3%
Other Punctuation
ValueCountFrequency (%)
: 57888
> 99.9%
. 2
 
< 0.1%
" 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
J 1
33.3%
M 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 38592
100.0%
Space Separator
ValueCountFrequency (%)
19300
100.0%
Math Symbol
ValueCountFrequency (%)
+ 19296
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 482559
> 99.9%
Latin 24
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 142136
29.5%
2 62535
13.0%
: 57888
12.0%
1 43291
 
9.0%
- 38592
 
8.0%
3 33261
 
6.9%
19300
 
4.0%
+ 19296
 
4.0%
4 13659
 
2.8%
5 13559
 
2.8%
Other values (7) 39042
 
8.1%
Latin
ValueCountFrequency (%)
a 3
12.5%
o 2
 
8.3%
t 2
 
8.3%
l 2
 
8.3%
e 2
 
8.3%
v 2
 
8.3%
P 1
 
4.2%
y 1
 
4.2%
g 1
 
4.2%
d 1
 
4.2%
Other values (7) 7
29.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 482583
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 142136
29.5%
2 62535
13.0%
: 57888
12.0%
1 43291
 
9.0%
- 38592
 
8.0%
3 33261
 
6.9%
19300
 
4.0%
+ 19296
 
4.0%
4 13659
 
2.8%
5 13559
 
2.8%
Other values (24) 39066
 
8.1%

text_bot_count
Categorical

High correlation  Imbalance 

Distinct28
Distinct (%)0.1%
Missing148
Missing (%)0.8%
Memory size1.6 MiB
0
18537 
1
 
420
2
 
245
3
 
73
4
 
9
Other values (23)
 
27

Length

Max length25
Median length1
Mean length1.0205064
Min length1

Characters and Unicode

Total characters19707
Distinct characters18
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22 ?
Unique (%)0.1%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 18537
95.3%
1 420
 
2.2%
2 245
 
1.3%
3 73
 
0.4%
4 9
 
< 0.1%
5 5
 
< 0.1%
246 1
 
< 0.1%
35 1
 
< 0.1%
2014-02-27 00:18:12+00:00 1
 
< 0.1%
2023-10-07 17:17:23+00:00 1
 
< 0.1%
Other values (18) 18
 
0.1%
(Missing) 148
 
0.8%
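
The Common Values table shows that text_bot_count, nominally a count, also contains timestamp strings, which is presumably why it was profiled as Categorical rather than Numeric. One way to isolate such stray entries before re-profiling is sketched below; split_counts is a hypothetical helper, and the sample list holds a handful of values taken from the table.

```python
def split_counts(values):
    """Separate entries that parse as non-negative integers from the rest."""
    counts, rejects = [], []
    for raw in values:
        text = raw.strip()
        # Accept plain ASCII digit strings only; anything else is set aside.
        if text.isascii() and text.isdigit():
            counts.append(int(text))
        else:
            rejects.append(raw)
    return counts, rejects

# A few values copied from the Common Values table above.
sample = ["0", "1", "246", "2014-02-27 00:18:12+00:00", "35"]
counts, rejects = split_counts(sample)
print(counts, rejects)
```

Rows that land in rejects are worth inspecting in the raw file, since a timestamp in a count column often points to shifted fields in that row.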

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 18537
95.9%
1 420
 
2.2%
2 245
 
1.3%
3 73
 
0.4%
4 9
 
< 0.1%
5 5
 
< 0.1%
2023-10-06 2
 
< 0.1%
35 1
 
< 0.1%
2014-02-27 1
 
< 0.1%
00:18:12+00:00 1
 
< 0.1%
Other values (33) 33
 
0.2%

Most occurring characters

ValueCountFrequency (%)
0 18655
94.7%
1 468
 
2.4%
2 288
 
1.5%
3 95
 
0.5%
: 48
 
0.2%
- 32
 
0.2%
4 22
 
0.1%
5 19
 
0.1%
17
 
0.1%
+ 16
 
0.1%
Other values (8) 47
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19590
99.4%
Other Punctuation 48
 
0.2%
Dash Punctuation 32
 
0.2%
Space Separator 17
 
0.1%
Math Symbol 16
 
0.1%
Lowercase Letter 3
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18655 | 95.2%
1 | 468 | 2.4%
2 | 288 | 1.5%
3 | 95 | 0.5%
4 | 22 | 0.1%
5 | 19 | 0.1%
7 | 12 | 0.1%
8 | 11 | 0.1%
9 | 11 | 0.1%
6 | 9 | < 0.1%

Lowercase Letter
Value | Count | Frequency (%)
u | 1 | 33.3%
s | 1 | 33.3%
t | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
: | 48 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 32 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 17 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 16 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
R | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 19703 | > 99.9%
Latin | 4 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18655 | 94.7%
1 | 468 | 2.4%
2 | 288 | 1.5%
3 | 95 | 0.5%
: | 48 | 0.2%
- | 32 | 0.2%
4 | 22 | 0.1%
5 | 19 | 0.1%
(space) | 17 | 0.1%
+ | 16 | 0.1%
Other values (4) | 43 | 0.2%

Latin
Value | Count | Frequency (%)
R | 1 | 25.0%
u | 1 | 25.0%
s | 1 | 25.0%
t | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 19707 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18655 | 94.7%
1 | 468 | 2.4%
2 | 288 | 1.5%
3 | 95 | 0.5%
: | 48 | 0.2%
- | 32 | 0.2%
4 | 22 | 0.1%
5 | 19 | 0.1%
(space) | 17 | 0.1%
+ | 16 | 0.1%
Other values (8) | 47 | 0.2%

log_public_repos
Real number (ℝ)

High correlation  Zeros 

Distinct: 667
Distinct (%): 3.4%
Missing: 46
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3764337
Minimum: 0
Maximum: 10.819798
Zeros: 966
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 820.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.69314718
Q1: 2.4849066
median: 3.5553481
Q3: 4.4188406
95-th percentile: 5.5174529
Maximum: 10.819798
Range: 10.819798
Interquartile range (IQR): 1.933934

Descriptive statistics

Standard deviation: 1.4886652
Coefficient of variation (CV): 0.44089868
Kurtosis: 0.025800282
Mean: 3.3764337
Median Absolute Deviation (MAD): 0.93328831
Skewness: -0.38278236
Sum: 65546.708
Variance: 2.216124
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 966 | 5.0%
0.6931471806 | 556 | 2.9%
1.098612289 | 468 | 2.4%
1.386294361 | 394 | 2.0%
1.609437912 | 377 | 1.9%
1.945910149 | 362 | 1.9%
1.791759469 | 350 | 1.8%
2.079441542 | 328 | 1.7%
2.302585093 | 306 | 1.6%
2.197224577 | 300 | 1.5%
Other values (657) | 15006 | 77.1%

Value | Count | Frequency (%)
0 | 966 | 5.0%
0.6931471806 | 556 | 2.9%
1.098612289 | 468 | 2.4%
1.386294361 | 394 | 2.0%
1.609437912 | 377 | 1.9%
1.791759469 | 350 | 1.8%
1.945910149 | 362 | 1.9%
2.079441542 | 328 | 1.7%
2.197224577 | 300 | 1.5%
2.302585093 | 306 | 1.6%

Value | Count | Frequency (%)
10.81979828 | 1 | < 0.1%
10.23088301 | 1 | < 0.1%
10.17964092 | 1 | < 0.1%
10.02654554 | 1 | < 0.1%
9.937599082 | 1 | < 0.1%
9.765718623 | 1 | < 0.1%
9.740144754 | 1 | < 0.1%
9.731512288 | 1 | < 0.1%
9.164819857 | 1 | < 0.1%
8.863474306 | 1 | < 0.1%
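
The most frequent values of log_public_repos (0, 0.6931471806, 1.098612289, ...) equal ln(1), ln(2), ln(3), which is consistent with the column being ln(1 + public_repos). That relationship is inferred from the values, not stated in the report, so the sketch below is an assumption about how the column was derived. The helper names are hypothetical.

```python
import math


def log1p_count(count):
    """Natural log of (1 + count); maps a zero count to exactly 0."""
    return math.log1p(count)


def original_count(log_value):
    """Invert log1p_count and round back to the nearest whole count."""
    return round(math.expm1(log_value))


# Raw counts of 0, 1, 2 and 26 map onto values seen in the report and its sample rows.
for raw in (0, 1, 2, 26):
    print(raw, round(log1p_count(raw), 6))
```

Using `log1p` instead of a plain logarithm keeps zero counts finite, which matches the report showing zeros but no infinite or negative values in this column.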

log_public_gists
Real number (ℝ)

High correlation  Zeros 

Distinct: 390
Distinct (%): 2.0%
Missing: 38
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3872489
Minimum: 0
Maximum: 10.929207
Zeros: 7790
Zeros (%): 40.0%
Negative: 0
Negative (%): 0.0%
Memory size: 820.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1.0986123
Q3: 2.3978953
95-th percentile: 4.2626799
Maximum: 10.929207
Range: 10.929207
Interquartile range (IQR): 2.3978953

Descriptive statistics

Standard deviation: 1.5188762
Coefficient of variation (CV): 1.0948837
Kurtosis: 0.40792863
Mean: 1.3872489
Median Absolute Deviation (MAD): 1.0986123
Skewness: 0.96025069
Sum: 26941.762
Variance: 2.306985
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 7790 | 40.0%
0.6931471806 | 1818 | 9.3%
1.098612289 | 1118 | 5.7%
1.386294361 | 804 | 4.1%
1.609437912 | 652 | 3.4%
1.791759469 | 610 | 3.1%
1.945910149 | 473 | 2.4%
2.079441542 | 398 | 2.0%
2.302585093 | 323 | 1.7%
2.197224577 | 311 | 1.6%
Other values (380) | 5124 | 26.3%

Value | Count | Frequency (%)
0 | 7790 | 40.0%
0.6931471806 | 1818 | 9.3%
1.098612289 | 1118 | 5.7%
1.386294361 | 804 | 4.1%
1.609437912 | 652 | 3.4%
1.791759469 | 610 | 3.1%
1.945910149 | 473 | 2.4%
2.079441542 | 398 | 2.0%
2.197224577 | 311 | 1.6%
2.302585093 | 323 | 1.7%

Value | Count | Frequency (%)
10.92920652 | 1 | < 0.1%
10.89044176 | 1 | < 0.1%
10.27311821 | 1 | < 0.1%
10.19913779 | 1 | < 0.1%
9.647497927 | 1 | < 0.1%
9.41970949 | 1 | < 0.1%
9.269080867 | 1 | < 0.1%
9.096611607 | 1 | < 0.1%
8.403352375 | 1 | < 0.1%
8.33423143 | 1 | < 0.1%
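
Several of the summary figures reported for log_public_gists follow from others by their standard definitions: the interquartile range is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in the report; `derived_stats` is a hypothetical helper, and small differences from the report are expected because the inputs are rounded.

```python
def derived_stats(mean, std, q1, q3):
    """Recompute IQR, coefficient of variation, and variance from summary figures."""
    return {
        "iqr": q3 - q1,        # interquartile range
        "cv": std / mean,      # coefficient of variation
        "variance": std ** 2,  # variance as squared standard deviation
    }


# Rounded inputs copied from the log_public_gists statistics above.
stats = derived_stats(mean=1.3872489, std=1.5188762, q1=0.0, q3=2.3978953)
for name, value in stats.items():
    print(name, round(value, 6))
```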

log_followers
Real number (ℝ)

High correlation  Zeros 

Distinct: 1564
Distinct (%): 8.0%
Missing: 30
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.490025
Minimum: 0
Maximum: 11.469527
Zeros: 1436
Zeros (%): 7.4%
Negative: 0
Negative (%): 0.0%
Memory size: 820.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2.0794415
median: 3.5263605
Q3: 4.8283137
95-th percentile: 6.7136848
Maximum: 11.469527
Range: 11.469527
Interquartile range (IQR): 2.7488722

Descriptive statistics

Standard deviation: 1.9544797
Coefficient of variation (CV): 0.56001883
Kurtosis: -0.28827067
Mean: 3.490025
Median Absolute Deviation (MAD): 1.3291359
Skewness: 0.13372599
Sum: 67807.696
Variance: 3.819991
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 1436 | 7.4%
0.6931471806 | 796 | 4.1%
1.098612289 | 616 | 3.2%
1.386294361 | 504 | 2.6%
1.609437912 | 443 | 2.3%
1.791759469 | 409 | 2.1%
1.945910149 | 396 | 2.0%
2.079441542 | 342 | 1.8%
2.197224577 | 337 | 1.7%
2.302585093 | 308 | 1.6%
Other values (1554) | 13842 | 71.1%

Value | Count | Frequency (%)
0 | 1436 | 7.4%
0.6931471806 | 796 | 4.1%
1.098612289 | 616 | 3.2%
1.386294361 | 504 | 2.6%
1.609437912 | 443 | 2.3%
1.791759469 | 409 | 2.1%
1.945910149 | 396 | 2.0%
2.079441542 | 342 | 1.8%
2.197224577 | 337 | 1.7%
2.302585093 | 308 | 1.6%

Value | Count | Frequency (%)
11.46952724 | 1 | < 0.1%
11.35017121 | 1 | < 0.1%
11.10049616 | 1 | < 0.1%
10.97597829 | 1 | < 0.1%
10.34563811 | 1 | < 0.1%
10.31850687 | 1 | < 0.1%
10.2995755 | 1 | < 0.1%
10.28926003 | 1 | < 0.1%
10.25456687 | 1 | < 0.1%
10.23192762 | 1 | < 0.1%

log_following
Real number (ℝ)

High correlation  Zeros 

Distinct: 607
Distinct (%): 3.1%
Missing: 151
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8481194
Minimum: 0
Maximum: 9.7256758
Zeros: 5901
Zeros (%): 30.3%
Negative: 0
Negative (%): 0.0%
Memory size: 820.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1.6094379
Q3: 3.1354942
95-th percentile: 4.9904326
Maximum: 9.7256758
Range: 9.7256758
Interquartile range (IQR): 3.1354942

Descriptive statistics

Standard deviation: 1.7374152
Coefficient of variation (CV): 0.94009897
Kurtosis: -0.25776983
Mean: 1.8481194
Median Absolute Deviation (MAD): 1.6094379
Skewness: 0.68501686
Sum: 35683.49
Variance: 3.0186115
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 5901 | 30.3%
0.6931471806 | 1709 | 8.8%
1.098612289 | 1073 | 5.5%
1.386294361 | 774 | 4.0%
1.609437912 | 596 | 3.1%
1.791759469 | 518 | 2.7%
1.945910149 | 470 | 2.4%
2.079441542 | 399 | 2.1%
2.197224577 | 357 | 1.8%
2.302585093 | 317 | 1.6%
Other values (597) | 7194 | 37.0%

Value | Count | Frequency (%)
0 | 5901 | 30.3%
0.6931471806 | 1709 | 8.8%
1.098612289 | 1073 | 5.5%
1.386294361 | 774 | 4.0%
1.609437912 | 596 | 3.1%
1.791759469 | 518 | 2.7%
1.945910149 | 470 | 2.4%
2.079441542 | 399 | 2.1%
2.197224577 | 357 | 1.8%
2.302585093 | 317 | 1.6%

Value | Count | Frequency (%)
9.725675811 | 1 | < 0.1%
9.676084944 | 1 | < 0.1%
9.386140712 | 1 | < 0.1%
9.236884927 | 1 | < 0.1%
9.182043773 | 1 | < 0.1%
9.178540059 | 1 | < 0.1%
9.162514742 | 1 | < 0.1%
9.145054905 | 1 | < 0.1%
8.905851181 | 1 | < 0.1%
8.733594062 | 1 | < 0.1%

Interactions

[Figures: 64 pairwise interaction plots]

Correlations

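 | blog | company | followers | following | hireable | label | location | log_followers | log_following | log_public_gists | log_public_repos | public_gists | public_repos | site_admin | text_bot_count | type
blog | 1.000 | 0.258 | 0.047 | 0.036 | 0.218 | 0.023 | 0.370 | 0.424 | 0.358 | 0.359 | 0.360 | 0.017 | 0.000 | 0.005 | 0.065 | 0.081
company | 0.258 | 1.000 | 0.017 | 0.005 | 0.058 | 0.070 | 0.393 | 0.257 | 0.195 | 0.181 | 0.196 | 0.000 | 0.009 | 0.025 | 0.070 | 0.102
followers | 0.047 | 0.017 | 1.000 | 0.535 | 0.000 | 0.000 | 0.020 | 1.000 | 0.535 | 0.592 | 0.649 | 0.592 | 0.649 | 0.000 | 0.000 | 0.000
following | 0.036 | 0.005 | 0.535 | 1.000 | 0.050 | 0.000 | 0.008 | 0.535 | 1.000 | 0.439 | 0.536 | 0.439 | 0.536 | 0.000 | 0.088 | 0.000
hireable | 0.218 | 0.058 | 0.000 | 0.050 | 1.000 | 0.056 | 0.177 | 0.213 | 0.266 | 0.199 | 0.226 | 0.000 | 0.015 | 0.013 | 0.061 | 0.041
label | 0.023 | 0.070 | 0.000 | 0.000 | 0.056 | 1.000 | 0.128 | 0.163 | 0.164 | 0.139 | 0.362 | 0.035 | 0.019 | 0.006 | 0.578 | 0.370
location | 0.370 | 0.393 | 0.020 | 0.008 | 0.177 | 0.128 | 1.000 | 0.394 | 0.357 | 0.290 | 0.349 | 0.000 | 0.000 | 0.019 | 0.129 | 0.125
log_followers | 0.424 | 0.257 | 1.000 | 0.535 | 0.213 | 0.163 | 0.394 | 1.000 | 0.535 | 0.592 | 0.649 | 0.592 | 0.649 | 0.075 | 0.059 | 0.226
log_following | 0.358 | 0.195 | 0.535 | 1.000 | 0.266 | 0.164 | 0.357 | 0.535 | 1.000 | 0.439 | 0.536 | 0.439 | 0.536 | 0.000 | 0.094 | 0.115
log_public_gists | 0.359 | 0.181 | 0.592 | 0.439 | 0.199 | 0.139 | 0.290 | 0.592 | 0.439 | 1.000 | 0.619 | 1.000 | 0.619 | 0.038 | 0.060 | 0.092
log_public_repos | 0.360 | 0.196 | 0.649 | 0.536 | 0.226 | 0.362 | 0.349 | 0.649 | 0.536 | 0.619 | 1.000 | 0.619 | 1.000 | 0.019 | 0.179 | 0.322
public_gists | 0.017 | 0.000 | 0.592 | 0.439 | 0.000 | 0.035 | 0.000 | 0.592 | 0.439 | 1.000 | 0.619 | 1.000 | 0.619 | 0.000 | 0.037 | 0.000
public_repos | 0.000 | 0.009 | 0.649 | 0.536 | 0.015 | 0.019 | 0.000 | 0.649 | 0.536 | 0.619 | 1.000 | 0.619 | 1.000 | 0.000 | 0.037 | 0.000
site_admin | 0.005 | 0.025 | 0.000 | 0.000 | 0.013 | 0.006 | 0.019 | 0.075 | 0.000 | 0.038 | 0.019 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
text_bot_count | 0.065 | 0.070 | 0.000 | 0.088 | 0.061 | 0.578 | 0.129 | 0.059 | 0.094 | 0.060 | 0.179 | 0.037 | 0.037 | 0.000 | 1.000 | 0.511
type | 0.081 | 0.102 | 0.000 | 0.000 | 0.041 | 0.370 | 0.125 | 0.226 | 0.115 | 0.092 | 0.322 | 0.000 | 0.000 | 0.000 | 0.511 | 1.000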
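
The matrix gives followers and log_followers a coefficient of 1.000 and identical coefficients against the other count variables. That is what a rank-based measure such as Spearman's would produce, because a monotone log transform leaves ranks unchanged. Whether the report used a rank-based coefficient for numeric pairs is an assumption here, and the sketch below only illustrates the invariance on made-up numbers with hand-written helper functions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only, not taken from the full dataset.
followers = [5, 9, 1212, 84, 6, 22, 63]
following = [1, 6, 221, 2, 2, 7, 16]
log_followers = [math.log1p(v) for v in followers]

print(spearman(followers, log_followers))
print(spearman(followers, following), spearman(log_followers, following))
```

Under that reading, the raw and log-transformed columns carry the same rank information, so keeping both in a model adds redundancy.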

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
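
One common way to compute a nullity correlation is to turn each column into a present/absent indicator and correlate the indicators. The sketch below does this for two small made-up columns. The function name and the data are hypothetical, and the report's own heatmap may be computed differently.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0 or sb == 0:
        # A column that is always present (or always missing) has no variation.
        return float("nan")
    return cov / (sa * sb)


# Made-up columns: None marks a missing value.
bio = [None, "x", None, "y", None, "z"]
company = [None, "c", None, "d", "e", "f"]
result = nullity_correlation(bio, company)
print(round(result, 4))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means their missingness is unrelated.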

Sample

 | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following
0 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 26.0 | 1.0 | 5.0 | 1.0 | 2011-09-26 17:27:03+00:00 | 2023-10-13 11:21:10+00:00 | 0 | 3.295837 | 0.693147 | 1.791759 | 0.693147
1 | Human | 1 | 0 | 0 | 1 | 0 | 1 | I just press the buttons randomly, and the program evolves... | 30.0 | 3.0 | 9.0 | 6.0 | 2015-06-29 10:12:46+00:00 | 2023-10-07 06:26:14+00:00 | 0 | 3.433987 | 1.386294 | 2.302585 | 1.945910
2 | Human | 1 | 0 | 1 | 1 | 1 | 1 | Time is unimportant,\nonly life important. | 103.0 | 49.0 | 1212.0 | 221.0 | 2008-08-29 16:20:03+00:00 | 2023-10-02 02:11:21+00:00 | 0 | 4.644391 | 3.912023 | 7.100852 | 5.402677
3 | Bot | 1 | 0 | 0 | 0 | 1 | 0 | NaN | 49.0 | 0.0 | 84.0 | 2.0 | 2014-05-20 18:43:09+00:00 | 2023-10-12 12:54:59+00:00 | 0 | 3.912023 | 0.000000 | 4.442651 | 1.098612
4 | Human | 1 | 0 | 0 | 0 | 0 | 1 | NaN | 11.0 | 1.0 | 6.0 | 2.0 | 2012-08-16 14:19:13+00:00 | 2023-10-06 11:58:41+00:00 | 0 | 2.484907 | 0.693147 | 1.945910 | 1.098612
5 | Human | 1 | 0 | 1 | 1 | 1 | 0 | Done studying. Need challenges. | 56.0 | 1.0 | 22.0 | 7.0 | 2017-04-11 14:08:07+00:00 | 2023-10-11 05:59:26+00:00 | 0 | 4.043051 | 0.693147 | 3.135494 | 2.079442
6 | Human | 1 | 0 | 1 | 1 | 1 | 1 | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. | 277.0 | 1139.0 | 63.0 | 16.0 | 2008-04-07 22:22:22+00:00 | 2023-09-27 09:04:56+00:00 | 0 | 5.627621 | 7.038784 | 4.158883 | 2.833213
7 | Human | 1 | 0 | 1 | 0 | 1 | 0 | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. | 37.0 | 1.0 | 22.0 | 0.0 | 2012-01-19 21:57:07+00:00 | 2023-08-07 16:06:34+00:00 | 0 | 3.637586 | 0.693147 | 3.135494 | 0.000000
8 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 27.0 | 2.0 | 37.0 | 596.0 | 2019-12-24 20:04:33+00:00 | 2023-10-12 11:55:01+00:00 | 0 | 3.332205 | 1.098612 | 3.637586 | 6.391917
9 | Human | 1 | 0 | 1 | 1 | 1 | 0 | Hi | 42.0 | 9.0 | 14.0 | 2.0 | 2013-07-23 23:29:34+00:00 | 2023-10-09 20:47:05+00:00 | 0 | 3.761200 | 2.302585 | 2.708050 | 1.098612

 | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following
19503 | Human | 1 | 0 | 1 | 0 | 1 | 0 | NaN | 30.0 | 0.0 | 10.0 | 11.0 | 2016-09-10 09:45:00+00:00 | 2023-10-06 11:30:51+00:00 | 0 | 3.433987 | 0.000000 | 2.397895 | 2.484907
19504 | Human | 1 | 0 | 0 | 0 | 1 | 1 | NaN | 37.0 | 19.0 | 91.0 | 6.0 | 2012-04-19 03:27:14+00:00 | 2023-10-07 18:13:52+00:00 | 0 | 3.637586 | 2.995732 | 4.521789 | 1.945910
19505 | Bot | 1 | 0 | 0 | 0 | 0 | 0 | I am the bot account of @alvaroaleman | 1.0 | 0.0 | 0.0 | 0.0 | 2018-12-15 19:55:31+00:00 | 2021-07-27 14:14:25+00:00 | 2 | 0.693147 | 0.000000 | 0.000000 | 0.000000
19506 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 3.0 | 0.0 | 1.0 | 0.0 | 2013-11-10 16:05:37+00:00 | 2023-08-31 14:26:08+00:00 | 2 | 1.386294 | 0.000000 | 0.693147 | 0.000000
19507 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 2020-10-01 18:30:32+00:00 | 2020-12-29 19:45:12+00:00 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.000000
19508 | Bot | 1 | 0 | 1 | 1 | 1 | 0 | Tony came to Linux in 1994 and has never looked back. His entire professional career has been spent working with or on Linux. First as a systems administrator | 36.0 | 16.0 | 11.0 | 4.0 | 2014-07-02 23:27:34+00:00 | 2023-08-15 16:38:34+00:00 | 0 | 3.610918 | 2.833213 | 2.484907 | 1.609438
19509 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 16.0 | 0.0 | 3.0 | 0.0 | 2017-12-06 21:56:31+00:00 | 2023-07-26 18:32:25+00:00 | 0 | 2.833213 | 0.000000 | 1.386294 | 0.000000
19510 | Human | 1 | 0 | 1 | 0 | 1 | 0 | Software engineer at RealTracs. | 13.0 | 0.0 | 10.0 | 1.0 | 2015-11-14 14:44:05+00:00 | 2022-08-23 21:09:49+00:00 | 0 | 2.639057 | 0.000000 | 2.397895 | 0.693147
19511 | Human | 1 | 0 | 1 | 0 | 0 | 0 | NaN | 7.0 | 0.0 | 2.0 | 0.0 | 2021-11-23 18:55:29+00:00 | 2023-10-06 22:50:45+00:00 | 0 | 2.079442 | 0.000000 | 1.098612 | 0.000000
19512 | Bot | 1 | 0 | 0 | 0 | 1 | 0 | NaN | 10.0 | 0.0 | 1.0 | 0.0 | 2016-04-22 22:11:59+00:00 | 2022-07-07 19:48:21+00:00 | 0 | 2.397895 | 0.000000 | 0.693147 | 0.000000